The Best 201 Text Recognition Tools in 2025

Table Transformer Structure Recognition
MIT
A Table Transformer model trained on the PubTables-1M dataset to extract table structure from unstructured documents
Text Recognition Transformers
microsoft
1.2M
186
Trocr Small Handwritten
TrOCR is a Transformer-based optical character recognition model designed for recognizing handwritten text in images.
Text Recognition Transformers
microsoft
517.96k
45
Table Transformer Structure Recognition V1.1 All
MIT
A Transformer-based model that detects table structure in documents
Text Recognition Transformers
microsoft
395.03k
70
Trocr Large Printed
Transformer-based optical character recognition model for single-line printed text recognition
Text Recognition Transformers
microsoft
295.59k
162
Texify
Texify is an OCR tool that converts images of formulas and text into LaTeX.
Text Recognition Transformers
vikp
206.53k
15
Trocr Base Printed
TrOCR is a Transformer-based optical character recognition model for single-line text images; it uses an encoder-decoder architecture.
Text Recognition Transformers
microsoft
184.84k
169
Manga Ocr Base
Apache-2.0
An optical character recognition tool for Japanese text, optimized for manga.
Text Recognition Transformers Japanese
kha-white
130.36k
145
Tiny Random Internvl2
Focuses on extracting text from images and converting it into editable text
Text Recognition Safetensors
katuni4ka
73.27k
0
Trocr Large Handwritten
TrOCR is a Transformer-based optical character recognition model specifically designed for handwritten text recognition, fine-tuned on the IAM dataset.
Text Recognition Transformers
microsoft
59.17k
115
Trocr Small Printed
TrOCR is a Transformer-based optical character recognition model designed for single-line text image OCR tasks.
Text Recognition Transformers
microsoft
20.88k
40
Lilt Roberta En Base
MIT
Language-independent Layout Transformer (LiLT) provides a LayoutLM-like model for any language by combining pre-trained RoBERTa (English) with a pre-trained language-independent layout transformer (LiLT).
Text Recognition Transformers
SCUT-DLVCLab
12.05k
19
CRAFT
CRAFT is a multilingual text detection model that locates text regions in images. It is optimized for Persian text but also supports other languages.
Text Recognition Supports Multiple Languages
hezarai
11.22k
6
PP OCRv5 Server Det
Apache-2.0
PP-OCRv5_server_det is the latest-generation text detection model developed by the PaddleOCR team. It is designed for high-performance applications and detects text in a range of scenarios, including handwritten, vertical, rotated, and curved text, across multiple languages.
Text Recognition Supports Multiple Languages
PaddlePaddle
8,722
2
PP OCRv5 Server Rec
Apache-2.0
PP-OCRv5_server_rec is the latest-generation text line recognition model developed by the PaddleOCR team. It supports multilingual text and complex text scenarios.
Text Recognition Supports Multiple Languages
PaddlePaddle
8,601
0
Uvdoc
Apache-2.0
UVDoc applies geometric transformations to document images to correct warping, tilt, and perspective distortion, which improves the accuracy of subsequent text recognition.
Text Recognition Supports Multiple Languages
PaddlePaddle
8,072
0
Trocr Base Handwritten Hist Swe 2
Apache-2.0
A historical handwriting recognition model jointly developed by the Swedish National Archives and other institutions, specifically designed for Swedish handwritten texts from 1600-1900.
Text Recognition Transformers Other
Riksarkivet
5,765
8
Pix2text Mfr
MIT
Pix2Text's Mathematical Formula Recognition (MFR) model, based on the TrOCR architecture, converts images of mathematical formulas into LaTeX.
Text Recognition Transformers
breezedeus
5,753
35
Mgp Str Base
MGP-STR is a pure vision-based scene text recognition model that achieves efficient OCR through multi-granularity prediction.
Text Recognition Transformers
alibaba-damo
4,981
64
Texteller
Apache-2.0
TexTeller is an end-to-end formula recognition model based on the ViT architecture, capable of recognizing mathematical formulas in natural images and converting them into LaTeX format.
Text Recognition Transformers
OleehyO
3,806
31
Trocr Large Stage1
TrOCR is a Transformer-based pre-trained model for Optical Character Recognition (OCR) tasks.
Text Recognition Transformers
microsoft
3,700
25
Crnn Base Fa V2
Apache-2.0
An OCR model for the Persian language, based on a CNN+LSTM architecture and optimized for printed and scanned documents. It also recognizes digits and special characters.
Text Recognition Other
hezarai
3,096
6
Qari OCR 0.1 VL 2B Instruct
Apache-2.0
An Arabic OCR model fine-tuned from the Qwen2-VL model and optimized for full-page Arabic text recognition
Text Recognition Transformers Arabic
NAMAA-Space
2,965
28
Crnn Fa Printed 96 Long
Apache-2.0
An OCR model for the Persian language, based on a CNN+LSTM architecture and designed for printed and scanned documents
Text Recognition Other
hezarai
2,886
5
Thai Trocr
Apache-2.0
A Thai and English optical character recognition model fine-tuned from the TrOCR base handwritten model; it is intended for images of handwritten text lines
Text Recognition Transformers Supports Multiple Languages
openthaigpt
2,677
9
Magi
Magi is an automatic transcription system that recognizes text and image elements in comics and generates the corresponding transcriptions.
Text Recognition Transformers English
ragavsachdeva
2,575
44
Layoutlmv3 Finetuned Funsd
A document understanding model: LayoutLMv3-base fine-tuned on the FUNSD dataset for token classification on forms and documents
Text Recognition Transformers
nielsr
2,420
25
Ko Trocr
Apache-2.0
An OCR model that supports recognition of Korean initial consonants (choseong), using an improved tokenizer to address the original TrOCR's weakness in this area
Text Recognition Transformers Korean
ddobokki
2,035
28
Olmocr 7B Thai V1
olmOCR is an optical character recognition model fine-tuned from Qwen2-VL-7B-Instruct. It converts document images, such as PDF pages, into text, and this fine-tune targets improved recognition accuracy in specific scenarios.
Text Recognition Safetensors Other
Adun
1,730
0
Table Transformer Structure Recognition V1.1 Pub
MIT
A Table Transformer model trained on the PubTables-1M dataset for table structure recognition in documents.
Text Recognition Transformers
microsoft
1,634
4
Mlcd Vit Bigg Patch14 448
MIT
MLCD-ViT-bigG is an advanced Vision Transformer model enhanced with 2D Rotary Position Encoding (RoPE2D), excelling in document understanding and visual question answering tasks.
Text Recognition
DeepGlint-AI
1,517
3
Pix2text Mfd
MIT
Pix2Text's Mathematical Formula Detection (MFD) model, which locates mathematical formulas in images
Text Recognition Other
breezedeus
1,369
3
Layoutlmv2 Finetuned Funsd
A document understanding model: Microsoft's LayoutLMv2 fine-tuned on the FUNSD dataset
Text Recognition Transformers
nielsr
1,319
13
PP DocLayout Plus L
Apache-2.0
PP-DocLayout_plus-L is a high-precision document layout detection model based on the RT-DETR-L architecture. It detects 20 common document element types.
Text Recognition Supports Multiple Languages
PaddlePaddle
1,308
0
RT DETR L Wireless Table Cell Det
Apache-2.0
RT-DETR-L_wireless_table_cell_det is a high-precision table cell detection model designed specifically for table recognition tasks. It can accurately locate and mark each cell area in the table image.
Text Recognition Supports Multiple Languages
PaddlePaddle
1,144
0
RT DETR L Wired Table Cell Det
Apache-2.0
RT-DETR-L_wired_table_cell_det is a key module in the table recognition task, mainly responsible for locating and marking each cell area in the table image.
Text Recognition Supports Multiple Languages
PaddlePaddle
1,144
0
Slanext Wired
Apache-2.0
SLANeXt_wired is a deep learning model for table structure recognition, which can convert non-editable table images into editable table formats (such as HTML).
Text Recognition Supports Multiple Languages
PaddlePaddle
1,141
0
Pix2text Table Rec
MIT
A table structure recognition model built on Microsoft's Table Transformer for table detection and recognition in documents
Text Recognition Transformers
breezedeus
1,124
2
Slanet Plus
Apache-2.0
SLANet_plus is a table structure recognition model that converts non-editable table images into editable table formats (such as HTML). It serves as the structure recognition component of a table recognition system.
Text Recognition Supports Multiple Languages
PaddlePaddle
1,121
0
Textnet Base
TextNet is a lightweight, efficient architecture designed for text detection. It comes in three variants that trade off detection accuracy against inference speed.
Text Recognition Transformers
czczup
1,061
3
PP DocBlockLayout
Apache-2.0
PP-DocBlockLayout is a document layout block detection model based on RT-DETR-L that identifies layout regions in a variety of document types.
Text Recognition Supports Multiple Languages
PaddlePaddle
1,039
0
Qari OCR V0.3 VL 2B Instruct
Apache-2.0
QARI-OCR v0.3 is an optical character recognition vision-language model focused on Arabic structured document understanding. It is built on Qwen2-VL-2B-Instruct and excels at preserving document layout and format.
Text Recognition Transformers Arabic
NAMAA-Space
1,016
2
PP OCRv4 Server Seal Det
Apache-2.0
The server-side seal text detection model of PP-OCRv4. It offers high accuracy and is intended for server deployment.
Text Recognition Supports Multiple Languages
PaddlePaddle
1,013
0
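
Several entries above describe TrOCR as an encoder-decoder model for single-line text images. As a rough illustration of what using such a model can look like, here is a minimal sketch with the Hugging Face `transformers` library. It assumes `transformers`, `torch`, and `Pillow` are installed, and it assumes the hub ID `microsoft/trocr-base-printed` for the "Trocr Base Printed" entry (inferred from the author and name shown in the listing, not stated there). The helper name `recognize_lines` is hypothetical.

```python
from typing import List, Sequence


def recognize_lines(
    image_paths: Sequence[str],
    model_id: str = "microsoft/trocr-base-printed",  # assumed hub ID
) -> List[str]:
    """Run a TrOCR checkpoint over cropped single-line text images."""
    if not image_paths:
        # Nothing to do; skip the heavy imports and the model download.
        return []

    # Heavy dependencies are imported lazily so the module loads quickly.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # The first call downloads the weights from the Hugging Face Hub.
    processor = TrOCRProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # TrOCR expects RGB input, one text line per image.
    images = [Image.open(path).convert("RGB") for path in image_paths]

    # Encoder input: resized and normalized pixel tensors.
    pixel_values = processor(images=images, return_tensors="pt").pixel_values

    # Decoder output: token IDs generated autoregressively.
    generated_ids = model.generate(pixel_values)

    # Convert token IDs back to strings, one per input image.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)
```

Calling `recognize_lines(["line.png"])` with a cropped line image (the file name is a placeholder) should return a one-element list of recognized text. Full pages would first need a text detection step, such as the detection models listed above, to produce the line crops.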
© 2025 AIbase